A Novel Sanitization Approach for Privacy Preserving Utility Itemset Mining

Authors

  • R. R. Rajalaxmi
  • A. M. Natarajan
Abstract

Data mining plays a vital role in today’s information world and has been widely applied in business organizations. The current trend in business collaboration requires sharing data or mined results for mutual benefit. However, releasing data also raises the threat of revealing sensitive information. Data sanitization is the process of concealing the sensitive itemsets present in the source database through appropriate modifications and releasing the modified database. Finding an optimal sanitization that minimizes the number of non-sensitive patterns lost is NP-hard. Recent data sanitization approaches hide sensitive itemsets by reducing their support, which considers only the presence or absence of itemsets. In real-world scenarios, however, transactions contain the purchased quantities of items along with their unit prices, so it is essential to consider the utility of itemsets in the source database. The utility mining model was introduced to address this by finding high utility itemsets. In this paper, we focus primarily on protecting privacy in utility mining. We consider the utility of the itemsets and propose a novel sanitization approach that makes minimal changes to the database and removes a minimum number of non-sensitive itemsets.
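
To make the notions of itemset utility and utility-based hiding concrete, the following is a minimal sketch in Python. It is not the algorithm proposed in the paper, whose details are not given in this abstract. The toy transactions, the prices, the function names, and the greedy rule for choosing which quantity to reduce are all assumptions made for illustration. Utility is taken here as quantity times unit price, summed over the transactions that contain every item of the itemset.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps an item to its purchased quantity.
transactions = [
    {"A": 2, "B": 1},
    {"A": 1, "C": 3},
    {"A": 3, "B": 2, "C": 1},
]
unit_price = {"A": 5, "B": 10, "C": 2}  # assumed unit prices


def transaction_utility(itemset, transaction, price):
    """Utility of an itemset inside one transaction that contains it."""
    return sum(transaction[i] * price[i] for i in itemset)


def itemset_utility(itemset, db, price):
    """Sum the itemset's utility over all transactions containing every item."""
    return sum(
        transaction_utility(itemset, t, price)
        for t in db
        if all(i in t for i in itemset)
    )


def high_utility_itemsets(db, price, min_util):
    """Brute-force enumeration; only practical for tiny examples."""
    items = sorted({i for t in db for i in t})
    found = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            u = itemset_utility(combo, db, price)
            if u >= min_util:
                found[combo] = u
    return found


def hide_itemset(db, price, sensitive, min_util):
    """Naive greedy hiding (an assumption, not the paper's method):
    repeatedly lower one quantity until the sensitive itemset's
    utility falls below min_util. Returns a modified copy of db."""
    if min_util <= 0:
        raise ValueError("min_util must be positive")
    db = [dict(t) for t in db]  # leave the source database untouched
    while itemset_utility(sensitive, db, price) >= min_util:
        supporting = [t for t in db if all(i in t for i in sensitive)]
        # Pick the transaction contributing the most utility.
        t = max(supporting, key=lambda tr: transaction_utility(sensitive, tr, price))
        # Within it, reduce the item contributing the most utility.
        item = max(sensitive, key=lambda i: t[i] * price[i])
        t[item] -= 1
        if t[item] == 0:
            del t[item]
    return db


sanitized = hide_itemset(transactions, unit_price, ("A", "B"), min_util=30)
```

Comparing `high_utility_itemsets(transactions, unit_price, 30)` with the same call on `sanitized` shows which non-sensitive itemsets were lost as a side effect of hiding, which is the cost the abstract says a good sanitization approach should minimize.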

Similar resources

Privacy Preserving Frequent Itemset Mining by Reducing Sensitive Items Frequency using GA

Frequent Itemset mining extracts novel and useful knowledge from large repositories of data and this knowledge is useful for effective analysis and decision making in telecommunication networks, marketing, medical analysis, website linkages, financial transactions, advertising and other applications. The misuse of these techniques may lead to disclosure of sensitive information. Motivated by th...

Fast algorithms for hiding sensitive high-utility itemsets in privacy-preserving utility mining

High-Utility Itemset Mining (HUIM) is an extension of frequent itemset mining that discovers high-utility itemsets (HUIs), that is, itemsets yielding a high profit in transaction databases. In recent years, a major issue that has arisen is that data publicly published or shared by organizations may lead to privacy threats, since sensitive or confidential information may be uncovered by data mining techniques. To address this ...

Data sanitization in association rule mining based on impact factor

Data sanitization is a process used to promote the sharing of transactional databases among organizations and businesses. It alleviates the concerns of individuals and organizations regarding the disclosure of sensitive patterns. It transforms the source database into a released database so that counterparts cannot discover the sensitive patterns and so data confidentiality is preserved ag...

Privacy Preserving Utility Mining Using Sanitization Approach

This thesis addresses privacy preserving utility mining using a sanitization approach. In this work, itemsets are protected as follows: first, we calculate the utility of all itemsets as the product of item cost and its number of transactions; then we set a threshold utility equal to the average of the maximum and minimum utility. Now, we will try to reduce the di...

Privacy and Utility Preserving Task Independent Data Mining

In today’s world of universal data exchange, there is a need to manage the risk of unintended information disclosure. Publishing data about individuals without revealing sensitive information about them is an important problem. K-anonymization is a popular approach used for data publishing. The limitations of K-anonymity were overcome by methods like L-diversity, T-closeness, (alpha, K) ...

Journal:
  • Computer and Information Science

Volume 1, Issue

Pages -

Publication date: 2008